Speaker identification using probabilistic PCA model selection

Authors

  • Jen-Tzung Chien
  • Chuan-Wei Ting
Abstract

Gaussian mixture model (GMM) techniques are popular for speaker identification. In theory, each Gaussian component should have a full covariance matrix. In practice, a diagonal covariance matrix is usually used because its inverse is easy to compute within the expectation-maximization (EM) algorithm. This paper proposes a new probabilistic principal component analysis (PPCA) model for speaker identification that accounts for the full covariance of each speaker's data. The model originates from factor analysis theory, and the probability distributions under PPCA are well defined. In particular, GMM and PPCA are found to be equivalent when a diagonal covariance matrix is used. We derive a novel PPCA model selection method and build models for different speakers. With PPCA model selection, the number of speech features and the number of mixture components can be determined dynamically. Experiments show that PPCA achieves desirable speaker recognition performance with proper model regularization.
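
The abstract does not spell out the PPCA density or the selection criterion, so the following is only a rough illustration of the general idea. It assumes the standard closed-form maximum-likelihood PPCA solution (Tipping and Bishop), in which a full covariance is modeled as C = W W^T + sigma^2 I, and it uses BIC as a stand-in criterion for choosing the latent dimension q. The paper's own selection rule may well differ, the sketch covers a single component rather than a mixture, and all function names are hypothetical.

```python
import numpy as np

def fit_ppca(X, q):
    """Closed-form ML PPCA with q latent dimensions (requires 0 <= q < d)."""
    mu = X.mean(axis=0)
    S = np.cov(X, rowvar=False, bias=True)        # ML sample covariance, d x d
    evals, evecs = np.linalg.eigh(S)              # eigenvalues in ascending order
    evals, evecs = evals[::-1], evecs[:, ::-1]    # reorder to descending
    sigma2 = evals[q:].mean()                     # noise variance: mean of discarded eigenvalues
    W = evecs[:, :q] * np.sqrt(np.maximum(evals[:q] - sigma2, 0.0))
    C = W @ W.T + sigma2 * np.eye(X.shape[1])     # full model covariance
    return mu, C

def avg_log_likelihood(X, mu, C):
    """Average Gaussian log-likelihood of the rows of X under N(mu, C)."""
    d = X.shape[1]
    diff = X - mu
    _, logdet = np.linalg.slogdet(C)
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(C), diff)
    return float(np.mean(-0.5 * (d * np.log(2 * np.pi) + logdet + maha)))

def select_q_by_bic(X, candidates):
    """Pick the latent dimension with the lowest BIC (assumed criterion, not the paper's)."""
    n, d = X.shape
    best_q, best_bic = None, np.inf
    for q in candidates:
        mu, C = fit_ppca(X, q)
        # Free parameters: mean (d), W up to rotation (d*q - q(q-1)/2), sigma^2 (1)
        k = d + d * q - q * (q - 1) // 2 + 1
        bic = -2.0 * n * avg_log_likelihood(X, mu, C) + k * np.log(n)
        if bic < best_bic:
            best_q, best_bic = q, bic
    return best_q
```

In a speaker identification setting, one would fit such a model per speaker on that speaker's feature vectors and assign a test utterance to the speaker whose model gives the highest likelihood. Setting q = 0 reduces the covariance to sigma^2 I, while q = d - 1 reproduces the full sample covariance.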

Similar resources

Dimension reduction for speaker identification based on mutual information

Dimension reduction is a necessary step for speech feature extraction in a speaker identification system. Discrete Cosine Transform (DCT) or Principal Component Analysis (PCA) is widely used for dimension reduction. By choosing basis vectors from basis vector pool of DCT or PCA which contribute more to data distribution variance or reconstruction accuracy of speech data set, we can transform th...

PCA Fuzzy Mixture Model for Speaker Identification

In this paper, we proposed the principal component analysis (PCA) fuzzy mixture model for speaker identification. A PCA fuzzy mixture model is derived from the combination of the PCA and the fuzzy version of mixture model with diagonal covariance matrices. In this method, the feature vectors are first transformed by each speaker’s PCA transformation matrix to reduce the correlation among the el...

A Comparative Study of Gender and Age Classification in Speech Signals

Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...

Neural Network based Classification for Speaker Identification

Speaker Recognition is a challenging task and is widely used in many speech aided applications. This study proposes a new Neural Network (NN) model for identifying the speaker, based on the acoustic features of a given speech sample extracted by applying wavelet transform on raw signals. Wrapper based feature selection applies dimensionality reduction by kernel PCA and ranking by Info gain. Onl...

Speaker identification using relaxation labeling

A nonlinear probabilistic model of the relaxation labeling (RL) process is implemented in the speaker identification task in order to disambiguate the labeling of the speech feature vectors. Identification rates using the RL are higher than those using the conventional VQ (vector quantization) method.

Publication date: 2004